Temporal Analysis and Visualization: Leveraging Time Series Capabilities in HVT (Hierarchical Voronoi Tessellation)

Zubin Dowlaty, Chepuri Gopi Krishna, Siddharth Shorya, Pon Anureka Seenivasan, Vishwavani

Created Date: 2023-10-26
Modified Date: 2025-07-10

1. Background

The HVT package is a collection of R functions that facilitate building topology-preserving maps for rich multivariate data analysis; see Figure 1 for an example of a 3D torus map generated from the package. The package is geared towards big data, that is, datasets with a large number of rows. The functions that support the typical workflow are organized below:

  1. Data Compression: Vector quantization (VQ), HVQ (hierarchical vector quantization) using means or medians. This step compresses the rows (long data frame) using a compression objective.

  2. Data Projection: Dimension projection of the compressed cells to 1D, 2D, or an interactive surface plot with Sammon’s nonlinear mapping algorithm. This step creates the topology-preserving map (also called embedding) coordinates in the desired output dimension.

  3. Tessellation: Create the cells required for object visualization using the Voronoi tessellation method. The package includes heatmap plots for hierarchical Voronoi tessellations (HVT). This step enables data insights, visualization, and interaction with the topology-preserving map, which is useful for semi-supervised tasks.

  4. Scoring: Scoring new data sets or test data and recording their assignment using the map objects from the above steps, in a sequence of maps if required.

  5. Temporal Analysis and Visualization: A collection of functions that analyze time series data for underlying patterns, calculate transition probabilities, and visualize the flow of data over time.

  6. Dynamic Forecasting: Simulate future states of dynamic systems using Monte Carlo simulations of Markov Chain (MSM), enabling ex-ante predictions for time-series data.

What’s New?

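Below are the new functions and brief descriptions of each:

  1. plotStateTransition: Creates a time series plotly object that displays the movement of data points across cells over time.

  2. getTransitionProbability: Calculates the transition probability from each current state to its next states, including or excluding self-states.

  3. reconcileTransitionProbability: Reconciles the transition probabilities calculated manually against those produced by the markovchain function, with and without self-states.

  4. plotAnimatedFlowmap: Creates flowmaps and animations for the highest transition probability, including and excluding self-states.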

2. Experimental setup

The Lorenz attractor is a three-dimensional figure that is generated by a set of differential equations that model a simple chaotic dynamic system of convective flow. Lorenz Attractor arises from a simplified set of equations that describe the behavior of a system involving three variables. These variables represent the state of the system at any given time and are typically denoted by (x, y, z). The equations are as follows:

\[ \frac{dx}{dt} = \sigma (y - x) \]
\[ \frac{dy}{dt} = x (r - z) - y \]
\[ \frac{dz}{dt} = x y - \beta z \]

where dx/dt, dy/dt, and dz/dt represent the rates of change of x, y, and z with respect to time t. σ, r, and β are constant parameters of the system: σ (σ = 10) controls the rate of convection, r (r = 28) controls the difference in temperature between the convective and stable regions, and β (β = 8/3) represents the ratio of the width to the height of the convective layer.

When the solutions of these equations are plotted in three-dimensional space, they trace a chaotic trajectory that never repeats. The Lorenz attractor exhibits sensitive dependence on initial conditions, meaning that even small differences in the initial conditions can lead to drastically different trajectories over time. This sensitivity is a defining characteristic of chaotic systems.
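
As a minimal sketch of how such a trajectory can be generated, the base R code below integrates the three equations with a fourth-order Runge-Kutta step. The function names lorenz_deriv and simulate_lorenz, the step size, and the number of steps are illustrative assumptions and are not part of the HVT package. The initial state (0, 1, 20) matches the first row of the dataset displayed in the Data Understanding section, but this sketch is not necessarily how that dataset was produced.

```r
# Right-hand side of the Lorenz system for a state vector c(x, y, z).
lorenz_deriv <- function(state, sigma = 10, r = 28, beta = 8 / 3) {
  x <- state[[1]]
  y <- state[[2]]
  z <- state[[3]]
  c(sigma * (y - x),   # dx/dt
    x * (r - z) - y,   # dy/dt
    x * y - beta * z)  # dz/dt
}

# Integrate the system with a classical RK4 scheme (hypothetical helper).
simulate_lorenz <- function(init = c(0, 1, 20), dt = 0.01, n_steps = 1000) {
  out <- matrix(NA_real_, nrow = n_steps + 1, ncol = 3,
                dimnames = list(NULL, c("X", "Y", "Z")))
  out[1, ] <- init
  for (i in seq_len(n_steps)) {
    s  <- out[i, ]
    k1 <- lorenz_deriv(s)
    k2 <- lorenz_deriv(s + 0.5 * dt * k1)
    k3 <- lorenz_deriv(s + 0.5 * dt * k2)
    k4 <- lorenz_deriv(s + dt * k3)
    out[i + 1, ] <- s + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
  }
  data.frame(t = (0:n_steps) * dt, out)
}

traj <- simulate_lorenz()
head(traj)
```

Rerunning simulate_lorenz with a slightly perturbed init and comparing the two trajectories is one way to observe the sensitivity to initial conditions described above.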

In this notebook, we will use the Lorenz Attractor Dataset. This dataset contains 200,000 observations and 5 columns. The dataset can be downloaded from here.

The dataset includes the following columns: X, Y, and Z (the coordinates describing the state of the system), U (velocity), and t (timestamp).

3. Notebook Requirements

This chunk verifies that all the packages needed to run this vignette are installed, installs any that are missing, and attaches them all to the session environment.

list.of.packages <- c("dplyr", "kableExtra", "plotly", "purrr", "data.table", "gridExtra", "grid", "reactable",
                    "reshape", "tidyr",  "stringr", "DT", "knitr", "feather","HVT","gganimate","gifski")

new.packages <-
  list.of.packages[!(list.of.packages %in% installed.packages()[, "Package"])]
if (length(new.packages))
  install.packages(new.packages, dependencies = TRUE, verbose = FALSE, repos='https://cloud.r-project.org/')
invisible(lapply(list.of.packages, library, character.only = TRUE))

4. Data Understanding

Here, we load the data. Let’s explore the Lorenz Attractor Dataset. For the sake of brevity, we are displaying only the first ten rows.

file_path <- ("./sample_dataset/lorenz_attractor.feather")
dataset <- read_feather(file_path) %>% as.data.frame()
dataset <- dataset %>% select(X,Y,Z,U,t)
dataset$t <- round(dataset$t, 5)
displayTable(head(dataset, 10))
X Y Z U t
0.0000 1.0000 20.0000 0.0000 0.0000
0.0025 0.9998 19.9867 0.0005 0.0003
0.0050 0.9995 19.9734 0.0010 0.0005
0.0075 0.9993 19.9601 0.0015 0.0008
0.0099 0.9990 19.9468 0.0020 0.0010
0.0124 0.9988 19.9335 0.0025 0.0013
0.0149 0.9986 19.9202 0.0030 0.0015
0.0173 0.9984 19.9069 0.0035 0.0018
0.0198 0.9982 19.8937 0.0040 0.0020
0.0222 0.9980 19.8804 0.0045 0.0022

Now, let’s try to visualize the Lorenz attractor (overlapping spirals) in 3D Space.

data_3d <- dataset[sample(1:nrow(dataset), 1000), ]
plot_ly(data_3d, x= ~X, y= ~Y, z = ~Z) %>% add_markers( marker = list(
                          size = 2,
                          symbol = "circle",
                          color = ~Z,
                          colorscale = "Bluered",
                          colorbar = (list(title = 'Z'))))

Figure 1: Lorenz attractor in 3D space

Now, let’s have a look at the structure of the Lorenz Attractor dataset.

str(dataset)
## 'data.frame':    200000 obs. of  5 variables:
##  $ X: num  0 0.0025 0.00499 0.00747 0.00995 ...
##  $ Y: num  1 1 1 0.999 0.999 ...
##  $ Z: num  20 20 20 20 19.9 ...
##  $ U: num  0 0.0005 0.001 0.0015 0.002 ...
##  $ t: num  0 0.00025 0.0005 0.00075 0.001 0.00125 0.0015 0.00175 0.002 0.00225 ...

EDA Plots

This section displays five objects.

Variable Histograms: The distribution of each feature in the dataset, shown as a histogram.

Box Plots: Box plots for all the features in the dataset. These plots display the median and interquartile range of each column at a panel level.

Correlation Matrix: The Pearson correlation, a bivariate value measuring the linear correlation between two numeric columns, calculated for each pair of columns. The output plot is shown as a matrix.

Summary EDA: A table of descriptive statistics for all the features in the dataset.

Time Series Plots: Plots of all features (including time) against the time column.

The built-in function edaPlots displays the five objects listed above.

NOTE: The input dataset should be a data frame object and the columns should be only numeric type.
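
To make the Pearson correlation used in the correlation matrix concrete, the sketch below computes it by hand for two numeric vectors and compares the result with base R's cor(). The vectors x and y are synthetic and purely illustrative; this is not the code edaPlots runs internally.

```r
set.seed(7)
x <- rnorm(100)
y <- 0.6 * x + rnorm(100, sd = 0.5)

# Pearson correlation: covariance of x and y divided by the product
# of their standard deviations (the 1/(n-1) factors cancel out).
pearson_manual <- sum((x - mean(x)) * (y - mean(y))) /
  sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))

# The same quantity from base R.
pearson_builtin <- cor(x, y, method = "pearson")

all.equal(pearson_manual, pearson_builtin)
```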

Summary Table

edaPlots(dataset)

Histogram

edaPlots(dataset, output_type = 'histogram')

Boxplots

edaPlots(dataset, output_type = 'boxplot')

Correlation Plot

edaPlots(dataset, output_type = 'correlation')

Time Series Plot

edaPlots(dataset, time_column = "t", output_type = "timeseries")


5. Model Constructing and Visualization

The dataset is prepped and ready for constructing the HVT model, which is the first and most prominent step. Model Training involves applying Hierarchical Vector Quantization (HVQ) to iteratively compress and project data into a hierarchy of cells. The process uses a quantization error threshold to determine the number of cells and levels in the hierarchy. The compressed data is then projected onto a 2D space, and the resulting tessellation provides a visual representation of the data distribution, enabling insights into the underlying patterns and relationships.
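
The compress-then-project idea can be sketched outside the package with stats::kmeans and MASS::sammon. This is only an illustration under stated assumptions: the data are synthetic, the choice of 15 cells is arbitrary, and HVT's own implementation (hierarchy, error thresholds, tessellation) is not reproduced here.

```r
set.seed(1)

# Synthetic, normalized 3-column data standing in for X, Y, Z.
x <- scale(matrix(rnorm(900), ncol = 3))

# Compression: quantize the rows into 15 cells and keep the centroids.
centroids <- kmeans(x, centers = 15, nstart = 5)$centers

# Projection: Sammon's nonlinear mapping of the centroids to 2D,
# driven by the pairwise Euclidean distances between centroids.
proj <- MASS::sammon(dist(centroids), k = 2, trace = FALSE)
coords <- proj$points

dim(coords)   # one 2D coordinate pair per cell
proj$stress   # final Sammon stress of the embedding
```

The 2D coordinates are what a Voronoi tessellation would then be built on.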

Model Parameters

NOTE: The compression takes place only for the X, Y, and Z coordinates and not for U (velocity) and t (timestamp). After training and scoring, we merge the U and t columns back into the dataset.

hvt.results <- trainHVT(
  dataset[,-c(4:5)],
  n_cells = 100,
  depth = 1,
  quant.err = 0.1,
  normalize = TRUE,
  distance_metric = "L1_Norm",
  error_metric = "max",
  quant_method = "kmeans",
  dim_reduction_method = "sammon")

Let’s check out the compression summary.

summary(hvt.results)
segmentLevel noOfCells noOfCellsBelowQuantizationError percentOfCellsBelowQuantizationErrorThreshold parameters
1 100 2 0.02 n_cells: 100 quant.err: 0.1 distance_metric: L1_Norm error_metric: max quant_method: kmeans

NOTE: Based on the provided table, it’s evident that the ‘percentOfCellsBelowQuantizationErrorThreshold’ value is 0.02, indicating that only 2% compression has taken place for the specified number of cells, which is 100. Typically, we would continue increasing the number of cells until at least 80% compression occurs. However, in this vignette demonstration, we’re not doing so, because the plots generated from temporal analysis functions would become cluttered and complex, making explanations less clear.
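
The share of cells below the quantization error threshold can be illustrated with a small sketch. The code below uses stats::kmeans on synthetic data, a plain L1 distance to the centroid, and the per-cell maximum as the error, mirroring the distance_metric and error_metric settings above in spirit. How trainHVT scales or aggregates these quantities internally is not shown here, so treat the exact definitions as assumptions.

```r
set.seed(42)

# Synthetic, normalized data and an illustrative number of cells.
pts       <- scale(matrix(rnorm(600), ncol = 3))
n_cells   <- 10
quant_err <- 0.1

km <- kmeans(pts, centers = n_cells, nstart = 5)

# L1 distance of every point to the centroid of its own cell.
l1_dist <- rowSums(abs(pts - km$centers[km$cluster, ]))

# Per-cell error under the "max" error metric.
cell_err <- tapply(l1_dist, km$cluster, max)

# Fraction of cells whose error is within the threshold.
share_below <- mean(cell_err <= quant_err)
share_below
```

Increasing n_cells shrinks the cells and therefore tends to raise this share, which is the trade-off the note above describes.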

Now, let’s plot the Voronoi tessellation for 100 cells.

plotHVT(
  hvt.results,
  centroid.size = c(0.6),
  plot.type = '2Dhvt',
  cell_id = FALSE)

Figure 2: The Voronoi tessellation for layer 1 shown for the 100 cells in the dataset ’Lorenz attractor’

To understand how cell IDs are distributed across the map, we again plot Voronoi tessellation with cell_id = TRUE.

plotHVT(
  hvt.results,
  centroid.size = c(0.6),
  plot.type = '2Dhvt',
  cell_id = TRUE)

Figure 3: The Voronoi tessellation for layer 1 shown for the 100 cells in the dataset ’Lorenz attractor’ with Cell ID

6. Scoring

Once the model is constructed, the next step is scoring the data points using the trained HVT model. Scoring involves assigning each data point to a specific cell based on the trained model. This process helps map data points to their correct hierarchical cell without the need for forecasting.

scoring <- scoreHVT(dataset,
                    hvt.results,
                    child.level = 1)

Here we are displaying only the first 20 rows of the scored dataset.

summary(scoring)
Row_Number Segment.Level Segment.Parent Segment.Child n Cell.ID Quant.Error centroidRadius diff anomalyFlag X Y Z
1 1 1 88 1 57 0.07 0.3618 0.2918 0 -0.0905 0.0338 -0.3663
2 1 1 88 1 57 0.0694 0.3618 0.2924 0 -0.0902 0.0338 -0.3678
3 1 1 88 1 57 0.0688 0.3618 0.2930 0 -0.0899 0.0337 -0.3693
4 1 1 88 1 57 0.0682 0.3618 0.2936 0 -0.0896 0.0337 -0.3708
5 1 1 88 1 57 0.0676 0.3618 0.2942 0 -0.0893 0.0337 -0.3723
6 1 1 88 1 57 0.0669 0.3618 0.2949 0 -0.0889 0.0336 -0.3738
7 1 1 88 1 57 0.0663 0.3618 0.2955 0 -0.0886 0.0336 -0.3753
8 1 1 88 1 57 0.0657 0.3618 0.2961 0 -0.0883 0.0336 -0.3768
9 1 1 88 1 57 0.0651 0.3618 0.2967 0 -0.0880 0.0336 -0.3783
10 1 1 88 1 57 0.0645 0.3618 0.2973 0 -0.0877 0.0336 -0.3798
11 1 1 88 1 57 0.0639 0.3618 0.2979 0 -0.0874 0.0335 -0.3813
12 1 1 88 1 57 0.0633 0.3618 0.2985 0 -0.0871 0.0335 -0.3828
13 1 1 88 1 57 0.0627 0.3618 0.2991 0 -0.0867 0.0335 -0.3843
14 1 1 88 1 57 0.0621 0.3618 0.2997 0 -0.0864 0.0335 -0.3858
15 1 1 88 1 57 0.0615 0.3618 0.3003 0 -0.0861 0.0334 -0.3872
16 1 1 88 1 57 0.0608 0.3618 0.3009 0 -0.0858 0.0334 -0.3887
17 1 1 88 1 57 0.0602 0.3618 0.3016 0 -0.0855 0.0334 -0.3902
18 1 1 88 1 57 0.0596 0.3618 0.3022 0 -0.0852 0.0334 -0.3917
19 1 1 88 1 57 0.059 0.3618 0.3028 0 -0.0849 0.0334 -0.3932
20 1 1 88 1 57 0.0584 0.3618 0.3034 0 -0.0846 0.0334 -0.3947

Below is a brief explanation of the features in the table above.

Let’s look at the scored model summary from scoreHVT().

scoring$model_info$scored_model_summary
## $input_dataset
## [1] "200000 Rows & 5 Columns"
## 
## $scored_qe_range
## [1] "5e-04 to 0.3433"
## 
## $mad.threshold
## [1] 0.2
## 
## $no_of_anomaly_datapoints
## [1] 1706
## 
## $no_of_anomaly_cells
## [1] 18

The model info displays five attributes, which are explained below:

  1. The dimensions of the dataset used in scoring.
  2. The range of the quantization error, from minimum to maximum, across all scored data points of the given dataset.
  3. The value of the Mean Absolute Deviation (MAD) threshold.
  4. The number of anomalous data points. A data point is considered an anomaly if its quantization error is greater than ‘mad.threshold’.
  5. The number of cells containing anomalies. A cell with even one anomalous data point is considered an anomaly cell.
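
The anomaly rules in items 4 and 5 can be expressed in a few lines of base R. The data frame below is a made-up toy example that reuses the column names Cell.ID, Quant.Error, and anomalyFlag from the scored output; it is a sketch of the rule as described, not the code scoreHVT executes.

```r
# Toy scored data: five points spread over three cells (invented values).
scored <- data.frame(
  Cell.ID     = c(57, 57, 12, 12, 30),
  Quant.Error = c(0.07, 0.25, 0.10, 0.19, 0.34)
)
mad_threshold <- 0.2

# A point is anomalous when its quantization error exceeds the threshold.
scored$anomalyFlag <- as.integer(scored$Quant.Error > mad_threshold)

# Number of anomalous points, and number of cells with at least one of them.
no_of_anomaly_datapoints <- sum(scored$anomalyFlag)
no_of_anomaly_cells <- length(unique(scored$Cell.ID[scored$anomalyFlag == 1]))

no_of_anomaly_datapoints
no_of_anomaly_cells
```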

Now, let’s merge back the U and t columns from the dataset to the scoring output and prepare the dataset for ‘Temporal Analysis’ Functions.

temporal_data <- cbind(scoring$scoredPredictedData, dataset[,c(4,5)]) %>% select(Cell.ID,t)


7. Timeseries plot with State Transitions

The first new function, plotStateTransition, creates a time series plotly object that displays the movement of data points across cells over time. Below are the function signature and its arguments.

plotStateTransition(
       df,
       sample_size,
       line_plot,
       cellid_column,
       time_column,
       v_intercept,
       time_periods)

plotStateTransition(df = temporal_data, 
                    cellid_column = "Cell.ID", 
                    time_column = "t",
                    sample_size = 0.2,
                    line_plot = TRUE)

To demonstrate the ‘sample_size’ argument, we replicate the plot above using the entire dataset (sample_size = 1).

plotStateTransition(df = temporal_data, 
                    cellid_column = "Cell.ID", 
                    time_column = "t",
                    sample_size = 1,
                    line_plot = TRUE)


8. Transition probability tables

The second new function, getTransitionProbability, calculates the transition probability from each current state to its next states, including or excluding self-states. Below are the function signature and its arguments.

getTransitionProbability(
        df, 
        cellid_column, 
        time_column,
        type = "with_self_state")

trans_table <- getTransitionProbability(df = temporal_data, cellid_column = "Cell.ID", time_column = "t")
displayTable(trans_table, limit = 325)
Current_State Next_State Relative_Frequency Transition_Probability Cumulative_Probability
1 1 1689 0.9860 0.9860
1 2 16 0.0093 0.9953
1 7 8 0.0047 1.0000
2 2 1297 0.9871 0.9871
2 5 16 0.0122 0.9993
2 7 1 0.0008 1.0000
3 1 24 0.0128 0.0128
3 3 1847 0.9861 0.9989
3 10 2 0.0011 1.0000
4 3 24 0.0140 0.0140
4 4 1692 0.9860 1.0000
5 5 1542 0.9878 0.9878
5 11 5 0.0032 0.9910
5 12 14 0.0090 1.0000
6 4 24 0.0125 0.0125
6 6 1898 0.9865 0.9990
6 9 2 0.0010 1.0000
7 2 1 0.0006 0.0006
7 5 3 0.0019 0.0025
7 7 1584 0.9906 0.9931
7 11 11 0.0069 1.0000
8 6 21 0.0148 0.0148
8 8 1394 0.9852 1.0000
9 3 2 0.0013 0.0013
9 9 1586 0.9925 0.9938
9 10 10 0.0063 1.0000
10 7 6 0.0036 0.0036
10 10 1635 0.9927 0.9963
10 16 6 0.0036 1.0000
11 11 1532 0.9897 0.9897
11 12 6 0.0039 0.9936
11 16 1 0.0006 0.9942
11 18 9 0.0058 1.0000
12 12 1565 0.9874 0.9874
12 18 4 0.0025 0.9899
12 19 16 0.0101 1.0000
13 8 14 0.0092 0.0092
13 13 1506 0.9875 0.9967
13 14 5 0.0033 1.0000
14 6 3 0.0019 0.0019
14 8 7 0.0045 0.0064
14 14 1530 0.9890 0.9954
14 15 7 0.0045 1.0000
15 6 2 0.0013 0.0013
15 9 10 0.0067 0.0080
15 15 1485 0.9920 1.0000
16 16 1454 0.9952 0.9952
16 18 3 0.0021 0.9973
16 24 4 0.0027 1.0000
17 13 16 0.0103 0.0103
17 17 1535 0.9878 0.9981
17 20 3 0.0019 1.0000
18 18 1615 0.9902 0.9902
18 19 3 0.0018 0.9920
18 24 2 0.0012 0.9932
18 25 11 0.0067 1.0000
19 19 1639 0.9885 0.9885
19 25 3 0.0018 0.9903
19 26 16 0.0097 1.0000
20 13 3 0.0018 0.0018
20 14 10 0.0059 0.0077
20 20 1678 0.9911 0.9988
20 21 2 0.0012 1.0000
21 14 2 0.0015 0.0015
21 15 5 0.0037 0.0052
21 20 1 0.0007 0.0059
21 21 1361 0.9942 1.0000
22 17 16 0.0103 0.0103
22 22 1541 0.9897 1.0000
23 17 3 0.0016 0.0016
23 20 11 0.0058 0.0074
23 21 1 0.0005 0.0079
23 22 1 0.0005 0.0084
23 23 1881 0.9916 1.0000
24 24 1452 0.9952 0.9952
24 33 4 0.0027 0.9979
24 36 3 0.0021 1.0000
25 24 1 0.0006 0.0006
25 25 1651 0.9910 0.9916
25 33 9 0.0054 0.9970
25 34 5 0.0030 1.0000
26 25 1 0.0007 0.0007
26 26 1444 0.9890 0.9897
26 34 15 0.0103 1.0000
27 22 7 0.0052 0.0052
27 27 1324 0.9918 0.9970
27 28 4 0.0030 1.0000
28 22 8 0.0052 0.0052
28 23 9 0.0058 0.0110
28 28 1535 0.9890 1.0000
29 21 5 0.0028 0.0028
29 23 3 0.0017 0.0045
29 29 1750 0.9954 1.0000
30 23 4 0.0024 0.0024
30 28 8 0.0047 0.0071
30 29 3 0.0018 0.0089
30 30 1681 0.9912 1.0000
31 27 2 0.0010 0.0010
31 28 5 0.0025 0.0035
31 30 8 0.0040 0.0075
31 31 1993 0.9920 0.9995
31 32 1 0.0005 1.0000
32 27 9 0.0056 0.0056
32 31 1 0.0006 0.0062
32 32 1586 0.9937 1.0000
33 33 1838 0.9903 0.9903
33 36 5 0.0027 0.9930
33 42 9 0.0048 0.9978
33 43 4 0.0022 1.0000
34 33 5 0.0028 0.0028
34 34 1768 0.9888 0.9916
34 43 15 0.0084 1.0000
35 29 3 0.0018 0.0018
35 30 5 0.0030 0.0048
35 35 1639 0.9945 0.9993
35 38 1 0.0006 1.0000
36 36 1603 0.9950 0.9950
36 41 6 0.0037 0.9987
36 42 2 0.0012 1.0000
37 29 2 0.0011 0.0011
37 35 4 0.0021 0.0032
37 37 1862 0.9968 1.0000
38 30 2 0.0010 0.0010
38 31 8 0.0040 0.0050
38 35 2 0.0010 0.0060
38 38 1990 0.9940 1.0000
39 31 7 0.0035 0.0035
39 38 3 0.0015 0.0050
39 39 1971 0.9950 1.0000
40 32 9 0.0046 0.0046
40 40 1934 0.9954 1.0000
41 37 2 0.0011 0.0011
41 41 1781 0.9961 0.9972
41 45 4 0.0022 0.9994
41 52 1 0.0006 1.0000
42 42 1842 0.9903 0.9903
42 49 11 0.0059 0.9962
42 54 7 0.0038 1.0000
43 42 7 0.0037 0.0037
43 43 1898 0.9901 0.9938
43 54 12 0.0063 1.0000
44 35 3 0.0010 0.0010
44 38 8 0.0027 0.0037
44 44 2917 0.9962 1.0000
45 37 3 0.0016 0.0016
45 44 5 0.0026 0.0042
45 45 1896 0.9937 0.9979
45 51 4 0.0021 1.0000
46 37 1 0.0011 0.0011
46 45 1 0.0011 0.0022
46 46 869 0.9954 0.9976
46 56 2 0.0023 1.0000
47 39 10 0.0042 0.0042
47 47 2374 0.9958 1.0000
48 40 9 0.0037 0.0037
48 48 2417 0.9963 1.0000
49 41 1 0.0005 0.0005
49 49 2079 0.9928 0.9933
49 52 10 0.0048 0.9981
49 61 4 0.0019 1.0000
50 44 1 0.0004 0.0004
50 47 10 0.0039 0.0043
50 50 2571 0.9950 0.9993
50 58 2 0.0008 1.0000
51 44 5 0.0017 0.0017
51 45 1 0.0003 0.0020
51 50 5 0.0017 0.0037
51 51 2969 0.9926 0.9963
51 57 11 0.0037 1.0000
52 45 4 0.0017 0.0017
52 52 2380 0.9942 0.9959
52 60 10 0.0042 1.0000
53 48 9 0.0036 0.0036
53 53 2485 0.9964 1.0000
54 49 4 0.0019 0.0019
54 54 2090 0.9910 0.9929
54 61 15 0.0071 1.0000
55 46 2 0.0029 0.0029
55 55 683 0.9927 0.9956
55 59 3 0.0044 1.0000
56 45 2 0.0007 0.0007
56 51 11 0.0037 0.0044
56 56 2990 0.9947 0.9991
56 57 2 0.0007 0.9998
56 60 1 0.0003 1.0000
57 50 8 0.0025 0.0025
57 51 1 0.0003 0.0028
57 57 3209 0.9941 0.9969
57 58 9 0.0028 0.9997
57 67 1 0.0003 1.0000
58 53 9 0.0036 0.0036
58 58 2489 0.9952 0.9988
58 66 3 0.0012 1.0000
59 46 2 0.0008 0.0008
59 56 14 0.0053 0.0061
59 59 2633 0.9940 1.0000
60 51 6 0.0026 0.0026
60 57 3 0.0013 0.0039
60 60 2315 0.9914 0.9953
60 63 10 0.0043 0.9996
60 69 1 0.0004 1.0000
61 52 3 0.0012 0.0012
61 61 2450 0.9923 0.9935
61 64 16 0.0065 1.0000
62 55 2 0.0009 0.0009
62 59 11 0.0049 0.0058
62 62 2230 0.9911 0.9969
62 68 7 0.0031 1.0000
63 57 2 0.0006 0.0006
63 63 3339 0.9946 0.9952
63 67 16 0.0048 1.0000
64 60 7 0.0028 0.0028
64 64 2442 0.9935 0.9963
64 69 9 0.0037 1.0000
65 60 2 0.0008 0.0008
65 63 3 0.0012 0.0020
65 65 2484 0.9956 0.9976
65 69 6 0.0024 1.0000
66 66 2706 0.9978 0.9978
66 70 3 0.0011 0.9989
66 72 3 0.0011 1.0000
67 58 1 0.0003 0.0003
67 66 2 0.0005 0.0008
67 67 3838 0.9956 0.9964
67 70 14 0.0036 1.0000
68 59 2 0.0009 0.0009
68 65 11 0.0048 0.0057
68 68 2274 0.9943 1.0000
69 63 5 0.0019 0.0019
69 69 2663 0.9940 0.9959
69 73 11 0.0041 1.0000
70 66 1 0.0003 0.0003
70 70 3599 0.9953 0.9956
70 75 16 0.0044 1.0000
71 55 3 0.0015 0.0015
71 62 12 0.0059 0.0074
71 71 2004 0.9911 0.9985
71 74 3 0.0015 1.0000
72 72 1693 0.9982 0.9982
72 78 3 0.0018 1.0000
73 73 2653 0.9959 0.9959
73 76 11 0.0041 1.0000
74 62 8 0.0038 0.0038
74 68 6 0.0028 0.0066
74 74 2096 0.9934 1.0000
75 75 3746 0.9957 0.9957
75 76 2 0.0005 0.9962
75 78 3 0.0008 0.9970
75 79 11 0.0029 1.0000
76 76 2746 0.9953 0.9953
76 79 3 0.0011 0.9964
76 81 10 0.0036 1.0000
77 71 18 0.0094 0.0094
77 77 1888 0.9906 1.0000
78 78 1369 0.9956 0.9956
78 79 3 0.0022 0.9978
78 85 3 0.0022 1.0000
79 79 3030 0.9944 0.9944
79 81 4 0.0013 0.9957
79 82 10 0.0033 0.9990
79 85 3 0.0010 1.0000
80 74 11 0.0058 0.0058
80 77 3 0.0016 0.0074
80 80 1876 0.9926 1.0000
81 81 2382 0.9942 0.9942
81 82 5 0.0021 0.9963
81 84 9 0.0038 1.0000
82 82 2345 0.9924 0.9924
82 84 6 0.0025 0.9949
82 88 12 0.0051 1.0000
83 77 15 0.0079 0.0079
83 80 4 0.0021 0.0100
83 83 1874 0.9900 1.0000
84 84 1964 0.9924 0.9924
84 87 11 0.0056 0.9980
84 88 4 0.0020 1.0000
85 82 3 0.0032 0.0032
85 85 931 0.9947 0.9979
85 88 2 0.0021 1.0000
86 80 10 0.0054 0.0054
86 83 8 0.0043 0.0097
86 86 1840 0.9903 1.0000
87 87 1946 0.9888 0.9888
87 90 11 0.0056 0.9944
87 91 11 0.0056 1.0000
88 87 11 0.0051 0.0051
88 88 2138 0.9917 0.9968
88 91 7 0.0032 1.0000
89 83 11 0.0061 0.0061
89 86 7 0.0039 0.0100
89 89 1784 0.9900 1.0000
90 90 2129 0.9889 0.9889
90 93 11 0.0051 0.9940
90 94 13 0.0060 1.0000
91 90 13 0.0068 0.0068
91 91 1881 0.9905 0.9973
91 94 5 0.0026 1.0000
92 86 11 0.0063 0.0063
92 89 1 0.0006 0.0069
92 92 1735 0.9920 0.9989
92 95 2 0.0011 1.0000
93 93 1965 0.9884 0.9884
93 97 11 0.0055 0.9939
93 99 12 0.0060 1.0000
94 93 12 0.0073 0.0073
94 94 1632 0.9891 0.9964
94 99 6 0.0036 1.0000
95 89 17 0.0097 0.0097
95 92 2 0.0011 0.0108
95 95 1734 0.9892 1.0000
96 92 11 0.0065 0.0065
96 96 1672 0.9905 0.9970
96 98 5 0.0030 1.0000
97 96 11 0.0058 0.0058
97 97 1859 0.9883 0.9941
97 100 11 0.0058 1.0000
98 92 1 0.0005 0.0005
98 95 17 0.0093 0.0098
98 98 1814 0.9902 1.0000
99 97 11 0.0068 0.0068
99 99 1606 0.9889 0.9957
99 100 7 0.0043 1.0000
100 96 5 0.0028 0.0028
100 98 13 0.0074 0.0102
100 100 1737 0.9897 1.0000
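
The columns of this table can be reproduced on a small example. The sketch below counts consecutive state pairs in a time-ordered sequence and normalizes the counts within each current state. The helper transition_table and the toy data are hypothetical and only illustrate the calculation; they are not the package's implementation.

```r
# Hypothetical helper: transition counts and probabilities from a
# data frame holding a cell ID column and a time column.
transition_table <- function(df, cellid_column, time_column) {
  df <- df[order(df[[time_column]]), ]
  states <- df[[cellid_column]]

  # Pair every state with the state that follows it in time.
  pairs <- data.frame(
    Current_State      = head(states, -1),
    Next_State         = tail(states, -1),
    Relative_Frequency = 1L
  )
  counts <- aggregate(Relative_Frequency ~ Current_State + Next_State,
                      data = pairs, FUN = sum)
  counts <- counts[order(counts$Current_State, counts$Next_State), ]

  # Normalize the counts within each current state.
  totals <- ave(counts$Relative_Frequency, counts$Current_State, FUN = sum)
  counts$Transition_Probability <- round(counts$Relative_Frequency / totals, 4)
  counts$Cumulative_Probability <- ave(counts$Transition_Probability,
                                       counts$Current_State, FUN = cumsum)
  rownames(counts) <- NULL
  counts
}

toy <- data.frame(Cell.ID = c(1, 1, 1, 2, 2, 1, 3, 3), t = 1:8)
toy_table <- transition_table(toy, cellid_column = "Cell.ID", time_column = "t")
toy_table
```

In the toy sequence, state 1 is followed by itself twice and by states 2 and 3 once each, so its row probabilities are 0.5, 0.25, and 0.25.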

9. Reconciling transition probability using markovchain package

The third new function, reconcileTransitionProbability, reconciles the transition probabilities calculated manually against those produced by the markovchain function, both with and without self-states. Below are the function signature and its arguments.

reconcileTransitionProbability(
                df, 
                hmap_type = "All", 
                cellid_column, 
                time_column)

reconcile_plots <- reconcileTransitionProbability(df = temporal_data, 
                                                  hmap_type = "All", 
                                                  cellid_column = "Cell.ID",
                                                  time_column = "t")
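
The idea behind the reconciliation can be sketched on a toy state sequence: estimate the transition matrix by simple counting, estimate it again with markovchain::markovchainFit, and look at the largest absolute difference. The toy sequence is invented, and the comparison only runs if the markovchain package is installed; this is an assumption-laden illustration rather than what reconcileTransitionProbability does internally.

```r
# Toy sequence of states, already ordered in time (invented values).
states <- as.character(c(1, 1, 1, 2, 2, 1, 3, 3))

# Manual estimate: count current -> next pairs and normalize each row.
manual <- prop.table(table(head(states, -1), tail(states, -1)), margin = 1)
manual

# Estimate from the markovchain package, if it is available.
if (requireNamespace("markovchain", quietly = TRUE)) {
  fit <- markovchain::markovchainFit(data = states, method = "mle")
  mc  <- fit$estimate@transitionMatrix
  # Largest absolute difference between the two estimates.
  max_abs_diff <- max(abs(manual[rownames(mc), colnames(mc)] - mc))
  print(max_abs_diff)
}
```

A difference of zero (up to floating point error) for every pair of states is what the diff column of the reconciliation tables reports.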

Reconciliation plots of transition probability with self_state

The transition probability of a state staying in the same state is calculated both manually and with the markovchain function, and the two results are plotted for comparison. The darker diagonal cells indicate higher probabilities of cells staying in the same state.

reconcile_plots[[1]]

Reconciliation table of transition probability with self-state

displayTable(reconcile_plots[[2]], limit = 217)
Current_State Next_State_manual Next_State_markov Probability_manual_calculation Probability_markov_function diff
1 2 2 0.0093 0.0093 0
1 7 7 0.0047 0.0047 0
2 5 5 0.0122 0.0122 0
2 7 7 0.0008 0.0008 0
3 1 1 0.0128 0.0128 0
3 10 10 0.0011 0.0011 0
4 3 3 0.0140 0.0140 0
5 11 11 0.0032 0.0032 0
5 12 12 0.0090 0.0090 0
6 4 4 0.0125 0.0125 0
6 9 9 0.0010 0.0010 0
7 2 2 0.0006 0.0006 0
7 5 5 0.0019 0.0019 0
7 11 11 0.0069 0.0069 0
8 6 6 0.0148 0.0148 0
9 3 3 0.0013 0.0013 0
9 10 10 0.0063 0.0063 0
10 7 7 0.0036 0.0036 0
10 16 16 0.0036 0.0036 0
11 12 12 0.0039 0.0039 0
11 16 16 0.0006 0.0006 0
11 18 18 0.0058 0.0058 0
12 18 18 0.0025 0.0025 0
12 19 19 0.0101 0.0101 0
13 8 8 0.0092 0.0092 0
13 14 14 0.0033 0.0033 0
14 6 6 0.0019 0.0019 0
14 8 8 0.0045 0.0045 0
14 15 15 0.0045 0.0045 0
15 6 6 0.0013 0.0013 0
15 9 9 0.0067 0.0067 0
16 18 18 0.0021 0.0021 0
16 24 24 0.0027 0.0027 0
17 13 13 0.0103 0.0103 0
17 20 20 0.0019 0.0019 0
18 19 19 0.0018 0.0018 0
18 24 24 0.0012 0.0012 0
18 25 25 0.0067 0.0067 0
19 25 25 0.0018 0.0018 0
19 26 26 0.0097 0.0097 0
20 13 13 0.0018 0.0018 0
20 14 14 0.0059 0.0059 0
20 21 21 0.0012 0.0012 0
21 14 14 0.0015 0.0015 0
21 15 15 0.0037 0.0037 0
21 20 20 0.0007 0.0007 0
22 17 17 0.0103 0.0103 0
23 17 17 0.0016 0.0016 0
23 20 20 0.0058 0.0058 0
23 21 21 0.0005 0.0005 0
23 22 22 0.0005 0.0005 0
24 33 33 0.0027 0.0027 0
24 36 36 0.0021 0.0021 0
25 24 24 0.0006 0.0006 0
25 33 33 0.0054 0.0054 0
25 34 34 0.0030 0.0030 0
26 25 25 0.0007 0.0007 0
26 34 34 0.0103 0.0103 0
27 22 22 0.0052 0.0052 0
27 28 28 0.0030 0.0030 0
28 22 22 0.0052 0.0052 0
28 23 23 0.0058 0.0058 0
29 21 21 0.0028 0.0028 0
29 23 23 0.0017 0.0017 0
30 23 23 0.0024 0.0024 0
30 28 28 0.0047 0.0047 0
30 29 29 0.0018 0.0018 0
31 27 27 0.0010 0.0010 0
31 28 28 0.0025 0.0025 0
31 30 30 0.0040 0.0040 0
31 32 32 0.0005 0.0005 0
32 27 27 0.0056 0.0056 0
32 31 31 0.0006 0.0006 0
33 36 36 0.0027 0.0027 0
33 42 42 0.0048 0.0048 0
33 43 43 0.0022 0.0022 0
34 33 33 0.0028 0.0028 0
34 43 43 0.0084 0.0084 0
35 29 29 0.0018 0.0018 0
35 30 30 0.0030 0.0030 0
35 38 38 0.0006 0.0006 0
36 41 41 0.0037 0.0037 0
36 42 42 0.0012 0.0012 0
37 29 29 0.0011 0.0011 0
37 35 35 0.0021 0.0021 0
38 30 30 0.0010 0.0010 0
38 31 31 0.0040 0.0040 0
38 35 35 0.0010 0.0010 0
39 31 31 0.0035 0.0035 0
39 38 38 0.0015 0.0015 0
40 32 32 0.0046 0.0046 0
41 37 37 0.0011 0.0011 0
41 45 45 0.0022 0.0022 0
41 52 52 0.0006 0.0006 0
42 49 49 0.0059 0.0059 0
42 54 54 0.0038 0.0038 0
43 42 42 0.0037 0.0037 0
43 54 54 0.0063 0.0063 0
44 35 35 0.0010 0.0010 0
44 38 38 0.0027 0.0027 0
45 37 37 0.0016 0.0016 0
45 44 44 0.0026 0.0026 0
45 51 51 0.0021 0.0021 0
46 37 37 0.0011 0.0011 0
46 45 45 0.0011 0.0011 0
46 56 56 0.0023 0.0023 0
47 39 39 0.0042 0.0042 0
48 40 40 0.0037 0.0037 0
49 41 41 0.0005 0.0005 0
49 52 52 0.0048 0.0048 0
49 61 61 0.0019 0.0019 0
50 44 44 0.0004 0.0004 0
50 47 47 0.0039 0.0039 0
50 58 58 0.0008 0.0008 0
51 44 44 0.0017 0.0017 0
51 45 45 0.0003 0.0003 0
51 50 50 0.0017 0.0017 0
51 57 57 0.0037 0.0037 0
52 45 45 0.0017 0.0017 0
52 60 60 0.0042 0.0042 0
53 48 48 0.0036 0.0036 0
54 49 49 0.0019 0.0019 0
54 61 61 0.0071 0.0071 0
55 46 46 0.0029 0.0029 0
55 59 59 0.0044 0.0044 0
56 45 45 0.0007 0.0007 0
56 51 51 0.0037 0.0037 0
56 57 57 0.0007 0.0007 0
56 60 60 0.0003 0.0003 0
57 50 50 0.0025 0.0025 0
57 51 51 0.0003 0.0003 0
57 58 58 0.0028 0.0028 0
57 67 67 0.0003 0.0003 0
58 53 53 0.0036 0.0036 0
58 66 66 0.0012 0.0012 0
59 46 46 0.0008 0.0008 0
59 56 56 0.0053 0.0053 0
60 51 51 0.0026 0.0026 0
60 57 57 0.0013 0.0013 0
60 63 63 0.0043 0.0043 0
60 69 69 0.0004 0.0004 0
61 52 52 0.0012 0.0012 0
61 64 64 0.0065 0.0065 0
62 55 55 0.0009 0.0009 0
62 59 59 0.0049 0.0049 0
62 68 68 0.0031 0.0031 0
63 57 57 0.0006 0.0006 0
63 67 67 0.0048 0.0048 0
64 60 60 0.0028 0.0028 0
64 69 69 0.0037 0.0037 0
65 60 60 0.0008 0.0008 0
65 63 63 0.0012 0.0012 0
65 69 69 0.0024 0.0024 0
66 70 70 0.0011 0.0011 0
66 72 72 0.0011 0.0011 0
67 58 58 0.0003 0.0003 0
67 66 66 0.0005 0.0005 0
67 70 70 0.0036 0.0036 0
68 59 59 0.0009 0.0009 0
68 65 65 0.0048 0.0048 0
69 63 63 0.0019 0.0019 0
69 73 73 0.0041 0.0041 0
70 66 66 0.0003 0.0003 0
70 75 75 0.0044 0.0044 0
71 55 55 0.0015 0.0015 0
71 62 62 0.0059 0.0059 0
71 74 74 0.0015 0.0015 0
72 78 78 0.0018 0.0018 0
73 76 76 0.0041 0.0041 0
74 62 62 0.0038 0.0038 0
74 68 68 0.0028 0.0028 0
75 76 76 0.0005 0.0005 0
75 78 78 0.0008 0.0008 0
75 79 79 0.0029 0.0029 0
76 79 79 0.0011 0.0011 0
76 81 81 0.0036 0.0036 0
77 71 71 0.0094 0.0094 0
78 79 79 0.0022 0.0022 0
78 85 85 0.0022 0.0022 0
79 81 81 0.0013 0.0013 0
79 82 82 0.0033 0.0033 0
79 85 85 0.0010 0.0010 0
80 74 74 0.0058 0.0058 0
80 77 77 0.0016 0.0016 0
81 82 82 0.0021 0.0021 0
81 84 84 0.0038 0.0038 0
82 84 84 0.0025 0.0025 0
82 88 88 0.0051 0.0051 0
83 77 77 0.0079 0.0079 0
83 80 80 0.0021 0.0021 0
84 87 87 0.0056 0.0056 0
84 88 88 0.0020 0.0020 0
85 82 82 0.0032 0.0032 0
85 88 88 0.0021 0.0021 0
86 80 80 0.0054 0.0054 0
86 83 83 0.0043 0.0043 0
87 90 90 0.0056 0.0056 0
87 91 91 0.0056 0.0056 0
88 87 87 0.0051 0.0051 0
88 91 91 0.0032 0.0032 0
89 83 83 0.0061 0.0061 0
89 86 86 0.0039 0.0039 0
90 93 93 0.0051 0.0051 0
90 94 94 0.0060 0.0060 0
91 90 90 0.0068 0.0068 0
91 94 94 0.0026 0.0026 0
92 86 86 0.0063 0.0063 0
92 89 89 0.0006 0.0006 0
92 95 95 0.0011 0.0011 0
93 97 97 0.0055 0.0055 0
93 99 99 0.0060 0.0060 0
94 93 93 0.0073 0.0073 0
94 99 99 0.0036 0.0036 0
95 89 89 0.0097 0.0097 0
95 92 92 0.0011 0.0011 0
96 92 92 0.0065 0.0065 0
96 98 98 0.0030 0.0030 0

Reconciliation plots of transition probability without self-state

The transition probability of a state moving to a different next state is calculated both manually and with the markovchain function, and the two results are plotted for comparison. Of all the next-state transitions, the one with the highest probability is selected.

reconcile_plots[[3]]

Reconciliation table of transition probability without self-state

displayTable(reconcile_plots[[4]], limit = 217)
Current_State Next_State_manual Next_State_markov Probability_manual_calculation Probability_markov_function diff
1 2 2 0.0093 0.0093 0
1 7 7 0.0047 0.0047 0
2 5 5 0.0122 0.0122 0
2 7 7 0.0008 0.0008 0
3 1 1 0.0128 0.0128 0
3 10 10 0.0011 0.0011 0
4 3 3 0.0140 0.0140 0
5 11 11 0.0032 0.0032 0
5 12 12 0.0090 0.0090 0
6 4 4 0.0125 0.0125 0
6 9 9 0.0010 0.0010 0
7 2 2 0.0006 0.0006 0
7 5 5 0.0019 0.0019 0
7 11 11 0.0069 0.0069 0
8 6 6 0.0148 0.0148 0
9 3 3 0.0013 0.0013 0
9 10 10 0.0063 0.0063 0
10 7 7 0.0036 0.0036 0
10 16 16 0.0036 0.0036 0
11 12 12 0.0039 0.0039 0
11 16 16 0.0006 0.0006 0
11 18 18 0.0058 0.0058 0
12 18 18 0.0025 0.0025 0
12 19 19 0.0101 0.0101 0
13 8 8 0.0092 0.0092 0
13 14 14 0.0033 0.0033 0
14 6 6 0.0019 0.0019 0
14 8 8 0.0045 0.0045 0
14 15 15 0.0045 0.0045 0
15 6 6 0.0013 0.0013 0
15 9 9 0.0067 0.0067 0
16 18 18 0.0021 0.0021 0
16 24 24 0.0027 0.0027 0
17 13 13 0.0103 0.0103 0
17 20 20 0.0019 0.0019 0
18 19 19 0.0018 0.0018 0
18 24 24 0.0012 0.0012 0
18 25 25 0.0067 0.0067 0
19 25 25 0.0018 0.0018 0
19 26 26 0.0097 0.0097 0
20 13 13 0.0018 0.0018 0
20 14 14 0.0059 0.0059 0
20 21 21 0.0012 0.0012 0
21 14 14 0.0015 0.0015 0
21 15 15 0.0037 0.0037 0
21 20 20 0.0007 0.0007 0
22 17 17 0.0103 0.0103 0
23 17 17 0.0016 0.0016 0
23 20 20 0.0058 0.0058 0
23 21 21 0.0005 0.0005 0
23 22 22 0.0005 0.0005 0
24 33 33 0.0027 0.0027 0
24 36 36 0.0021 0.0021 0
25 24 24 0.0006 0.0006 0
25 33 33 0.0054 0.0054 0
25 34 34 0.0030 0.0030 0
26 25 25 0.0007 0.0007 0
26 34 34 0.0103 0.0103 0
27 22 22 0.0052 0.0052 0
27 28 28 0.0030 0.0030 0
28 22 22 0.0052 0.0052 0
28 23 23 0.0058 0.0058 0
29 21 21 0.0028 0.0028 0
29 23 23 0.0017 0.0017 0
30 23 23 0.0024 0.0024 0
30 28 28 0.0047 0.0047 0
30 29 29 0.0018 0.0018 0
31 27 27 0.0010 0.0010 0
31 28 28 0.0025 0.0025 0
31 30 30 0.0040 0.0040 0
31 32 32 0.0005 0.0005 0
32 27 27 0.0056 0.0056 0
32 31 31 0.0006 0.0006 0
33 36 36 0.0027 0.0027 0
33 42 42 0.0048 0.0048 0
33 43 43 0.0022 0.0022 0
34 33 33 0.0028 0.0028 0
34 43 43 0.0084 0.0084 0
35 29 29 0.0018 0.0018 0
35 30 30 0.0030 0.0030 0
35 38 38 0.0006 0.0006 0
36 41 41 0.0037 0.0037 0
36 42 42 0.0012 0.0012 0
37 29 29 0.0011 0.0011 0
37 35 35 0.0021 0.0021 0
38 30 30 0.0010 0.0010 0
38 31 31 0.0040 0.0040 0
38 35 35 0.0010 0.0010 0
39 31 31 0.0035 0.0035 0
39 38 38 0.0015 0.0015 0
40 32 32 0.0046 0.0046 0
41 37 37 0.0011 0.0011 0
41 45 45 0.0022 0.0022 0
41 52 52 0.0006 0.0006 0
42 49 49 0.0059 0.0059 0
42 54 54 0.0038 0.0038 0
43 42 42 0.0037 0.0037 0
43 54 54 0.0063 0.0063 0
44 35 35 0.0010 0.0010 0
44 38 38 0.0027 0.0027 0
45 37 37 0.0016 0.0016 0
45 44 44 0.0026 0.0026 0
45 51 51 0.0021 0.0021 0
46 37 37 0.0011 0.0011 0
46 45 45 0.0011 0.0011 0
46 56 56 0.0023 0.0023 0
47 39 39 0.0042 0.0042 0
48 40 40 0.0037 0.0037 0
49 41 41 0.0005 0.0005 0
49 52 52 0.0048 0.0048 0
49 61 61 0.0019 0.0019 0
50 44 44 0.0004 0.0004 0
50 47 47 0.0039 0.0039 0
50 58 58 0.0008 0.0008 0
51 44 44 0.0017 0.0017 0
51 45 45 0.0003 0.0003 0
51 50 50 0.0017 0.0017 0
51 57 57 0.0037 0.0037 0
52 45 45 0.0017 0.0017 0
52 60 60 0.0042 0.0042 0
53 48 48 0.0036 0.0036 0
54 49 49 0.0019 0.0019 0
54 61 61 0.0071 0.0071 0
55 46 46 0.0029 0.0029 0
55 59 59 0.0044 0.0044 0
56 45 45 0.0007 0.0007 0
56 51 51 0.0037 0.0037 0
56 57 57 0.0007 0.0007 0
56 60 60 0.0003 0.0003 0
57 50 50 0.0025 0.0025 0
57 51 51 0.0003 0.0003 0
57 58 58 0.0028 0.0028 0
57 67 67 0.0003 0.0003 0
58 53 53 0.0036 0.0036 0
58 66 66 0.0012 0.0012 0
59 46 46 0.0008 0.0008 0
59 56 56 0.0053 0.0053 0
60 51 51 0.0026 0.0026 0
60 57 57 0.0013 0.0013 0
60 63 63 0.0043 0.0043 0
60 69 69 0.0004 0.0004 0
61 52 52 0.0012 0.0012 0
61 64 64 0.0065 0.0065 0
62 55 55 0.0009 0.0009 0
62 59 59 0.0049 0.0049 0
62 68 68 0.0031 0.0031 0
63 57 57 0.0006 0.0006 0
63 67 67 0.0048 0.0048 0
64 60 60 0.0028 0.0028 0
64 69 69 0.0037 0.0037 0
65 60 60 0.0008 0.0008 0
65 63 63 0.0012 0.0012 0
65 69 69 0.0024 0.0024 0
66 70 70 0.0011 0.0011 0
66 72 72 0.0011 0.0011 0
67 58 58 0.0003 0.0003 0
67 66 66 0.0005 0.0005 0
67 70 70 0.0036 0.0036 0
68 59 59 0.0009 0.0009 0
68 65 65 0.0048 0.0048 0
69 63 63 0.0019 0.0019 0
69 73 73 0.0041 0.0041 0
70 66 66 0.0003 0.0003 0
70 75 75 0.0044 0.0044 0
71 55 55 0.0015 0.0015 0
71 62 62 0.0059 0.0059 0
71 74 74 0.0015 0.0015 0
72 78 78 0.0018 0.0018 0
73 76 76 0.0041 0.0041 0
74 62 62 0.0038 0.0038 0
74 68 68 0.0028 0.0028 0
75 76 76 0.0005 0.0005 0
75 78 78 0.0008 0.0008 0
75 79 79 0.0029 0.0029 0
76 79 79 0.0011 0.0011 0
76 81 81 0.0036 0.0036 0
77 71 71 0.0094 0.0094 0
78 79 79 0.0022 0.0022 0
78 85 85 0.0022 0.0022 0
79 81 81 0.0013 0.0013 0
79 82 82 0.0033 0.0033 0
79 85 85 0.0010 0.0010 0
80 74 74 0.0058 0.0058 0
80 77 77 0.0016 0.0016 0
81 82 82 0.0021 0.0021 0
81 84 84 0.0038 0.0038 0
82 84 84 0.0025 0.0025 0
82 88 88 0.0051 0.0051 0
83 77 77 0.0079 0.0079 0
83 80 80 0.0021 0.0021 0
84 87 87 0.0056 0.0056 0
84 88 88 0.0020 0.0020 0
85 82 82 0.0032 0.0032 0
85 88 88 0.0021 0.0021 0
86 80 80 0.0054 0.0054 0
86 83 83 0.0043 0.0043 0
87 90 90 0.0056 0.0056 0
87 91 91 0.0056 0.0056 0
88 87 87 0.0051 0.0051 0
88 91 91 0.0032 0.0032 0
89 83 83 0.0061 0.0061 0
89 86 86 0.0039 0.0039 0
90 93 93 0.0051 0.0051 0
90 94 94 0.0060 0.0060 0
91 90 90 0.0068 0.0068 0
91 94 94 0.0026 0.0026 0
92 86 86 0.0063 0.0063 0
92 89 89 0.0006 0.0006 0
92 95 95 0.0011 0.0011 0
93 97 97 0.0055 0.0055 0
93 99 99 0.0060 0.0060 0
94 93 93 0.0073 0.0073 0
94 99 99 0.0036 0.0036 0
95 89 89 0.0097 0.0097 0
95 92 92 0.0011 0.0011 0
96 92 92 0.0065 0.0065 0
96 98 98 0.0030 0.0030 0


10. Flowmaps

The fourth new function, plotAnimatedFlowmap, creates flowmaps and animations for the highest transition probability, including and excluding self-states. Below are the function signature and its arguments.

plotAnimatedFlowmap(
         hvt_model_output, 
         transition_probability_df, 
         df, 
         flow_map = "All", 
         cellid_column, 
         time_column)

flowmap_plots <- plotAnimatedFlowmap(hvt_model_output = hvt.results,
                                     transition_probability_df =trans_table,
                                     df = temporal_data, 
                                     flow_map = 'All',
                                     cellid_column = "Cell.ID", 
                                     time_column = "t")


1. Flow map: Highest transition probability including self-state

The size of the circle around each cell’s centroid represents the self-state probability: the larger the circle, the higher the probability of staying in the same cell.

flowmap_plots[[1]]


2. Flow map: Highest transition probability excluding self-states: Arrow size represents transition probability

The arrow size represents the probability of the data moving to the next state, and the arrow direction points to the cell it moves to next.

flowmap_plots[[2]]
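
The quantities behind the two flow maps can be derived from a transition table with a few lines of base R. The sketch below uses the rows for states 1 to 3 from the transition probability table in Section 8 and extracts, for each state, its self-state probability and its most likely next state excluding itself. The object names are illustrative, and the plotting done by plotAnimatedFlowmap is not reproduced.

```r
# Rows for states 1, 2 and 3 copied from the transition probability table.
trans <- data.frame(
  Current_State          = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  Next_State             = c(1, 2, 7, 2, 5, 7, 1, 3, 10),
  Transition_Probability = c(0.9860, 0.0093, 0.0047,
                             0.9871, 0.0122, 0.0008,
                             0.0128, 0.9861, 0.0011)
)

# Self-state probability per cell (drives the circle size).
self_state <- trans[trans$Current_State == trans$Next_State,
                    c("Current_State", "Transition_Probability")]

# Most likely move to a different cell (drives the arrow).
moves <- trans[trans$Current_State != trans$Next_State, ]
best_move <- do.call(rbind, lapply(
  split(moves, moves$Current_State),
  function(g) g[which.max(g$Transition_Probability), ]
))
rownames(best_move) <- NULL

self_state
best_move
```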